EXECUTIVE SUMMARY


Mapping and sequencing the genomes of higher plants is a high
priority for agricultural research. Such knowledge will provide
the means to understand plant processes and the know-how to
genetically engineer plants to resolve agricultural problems
through improved growth, better resistance to biotic and abiotic
stresses, and enhanced quality.


The goals of a USDA plant genome mapping and sequencing initiative 
should be: 


TO PROVIDE A FOUNDATION OF KNOWLEDGE FOR PLANT SCIENCE RESEARCH WELL 
INTO THE TWENTY-FIRST CENTURY. 


TO ALLOW THE UNITED STATES TO STRENGTHEN ITS GLOBAL POSITION IN 
AGRICULTURAL EFFICIENCY AND PROFITABILITY. 


Locating individual genes on chromosomes and understanding the
products of their chemical makeup is now possible on a large scale
as a consequence of the new techniques of biotechnology. Through
gene mapping, unique fragments of DNA are located and assigned to
a particular chromosome. This technology allows DNA to be
fragmented at identifiable points, recoupled with other segments,
grown in unlimited quantities (cloned), and chemically sequenced
by individual nucleotide base pairs. The ‘restriction enzymes’
used to fragment the DNA snip it at particular sequences, producing
fragments of different lengths. Differences among individuals in
the lengths of these DNA ‘snippets’ are referred to as restriction
fragment length polymorphisms (RFLPs) and have contributed to
rapid advancements in genome mapping. RFLPs have become key tools
in understanding the genetics of all types of organisms,
especially plants. Possession of sequenced plant gene libraries
will be fundamental to the biology and agricultural science of the
future.
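
The idea behind an RFLP can be illustrated with a minimal sketch.
The sequences below are invented for illustration, and the choice
of the enzyme EcoRI (recognition sequence GAATTC, cutting after the
first G) is an assumption made only for this example; the function
name digest is likewise hypothetical. The sketch cuts two versions
of a short DNA segment and shows that a single base change, by
removing one recognition site, changes the fragment lengths.

```python
# Illustrative sketch only: sequences and the choice of EcoRI are assumptions.
ECORI_SITE = "GAATTC"    # EcoRI recognition sequence
ECORI_CUT_OFFSET = 1     # EcoRI cuts between G and AATTC


def digest(sequence, site=ECORI_SITE, cut_offset=ECORI_CUT_OFFSET):
    """Return the fragment lengths left after cutting a linear sequence
    at every occurrence of the recognition site."""
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cut_positions + [len(sequence)]
    return [end - begin for begin, end in zip(boundaries, boundaries[1:])]


# Two hypothetical alleles of the same 44-base segment. In allele_b a
# single base change (GAATTC -> GAGTTC) has destroyed the second site.
allele_a = "ATGC" * 3 + "GAATTC" + "CCGGTTAA" * 2 + "GAATTC" + "TTTT"
allele_b = "ATGC" * 3 + "GAATTC" + "CCGGTTAA" * 2 + "GAGTTC" + "TTTT"

fragments_a = digest(allele_a)   # three fragments
fragments_b = digest(allele_b)   # two fragments: the polymorphism
print(fragments_a, fragments_b)
```

The two alleles give different sets of fragment lengths, and it is
this heritable difference that serves as a mappable marker.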


In contrast to RFLP maps, physical maps show the actual distances
between landmarks on chromosomes. RFLP maps, genome sequences,
and physical maps for individual plant species, when combined with
fundamental knowledge in plant physiology and biochemistry, take
on immense importance in agricultural plant science as research
attention turns to genetically engineering important crop and
forest species.


It is evident that the United States must soon begin to encourage
fundamental scientific efforts in basic biology if it is to remain
on the cutting edge of scientific achievements in plant science.
Other nations are aggressively pursuing plant genome mapping at
this time. For example, the Japanese Government is planning to
allocate $200 million over the next several years to obtaining
fundamental information on gene mapping and sequencing in rice.
Other nations are planning similar initiatives. USDA support for
such efforts would be expected to provide fundamental knowledge
essential for research into the next century and to support
agricultural research, education, and problem solving for
commercial agriculture.


The conference participants have developed the following report to
record the present ‘state-of-the-art’ for plant genome mapping
and sequencing (Chapter I) and information networks designed to
distribute current knowledge and data for mapping and sequencing
(Chapter II). The report also includes a chapter for each of three
workshops conducted to develop points-to-consider in: (1) the
selection of species; (2) the design of an information network; and
(3) activities to develop and implement a national initiative in
plant genome research.


The present state of knowledge in plant genome mapping and
sequencing is advancing, but unevenly across commodities. There
are various reasons for these patterns, including technical
limitations, biological realities, and funding constraints.


Information network technology is highly advanced and very suitable
to support the needs of a national initiative in plant genome
research. Present applications in genome data storage and
retrieval (e.g., GenBank), the Human Genome Research Project, and
other related research activities are reviewed in Chapter II.


Of the many points to consider when selecting species of plants for 
genome mapping and sequencing, the following points were identified 
as important: 


O Economic impact and domestic importance. 

O Maximum information transfer to other plant species. 
O Genomic analysis and size. 

O Basic and fundamental insight. 

O Existence of a good knowledgebase. 

Some of the features of a National Information Network to support
plant genome research should include the following items:


O User-friendly. 

O Allow for all types of maps. 
O Be up-to-date. 

O Inexpensive. 


Six important items that could serve as activities to initiate a
plant genome research program are:


O Develop a clear working definition of the topic, with
priorities of activities.

O Specify the advantages of working with plants. 

O Solicit some group grant proposals. 

O Develop some core facilities. 

O Appoint a Plant Genome Research Advisory Committee. 

O Make funding available to support plant genome research.


The advantages of undertaking the initiative, as seen by the
conference participants, were:


O Maintenance of U.S. economic and scientific competitive
positions.

O Better use of germplasm resources, especially through
biotechnology.

O Better ability to address the fragility of natural and managed
ecosystems through fundamental plant science knowledge.

INTRODUCTION


The Plant Genome Research Conference was convened in Washington,
D.C., on December 12-14, 1988, to address questions related to
planning an initiative on plant genome mapping and sequencing (the
Conference Program is presented as Appendix I). A diverse group
of plant scientists and database experts from the public and
private sectors was assembled to review current knowledge, identify
priorities, develop guidelines, and propose plans for a major
initiative to map and sequence the genomes of plant species of
importance to agriculture and forestry.


The conference began with comments from Assistant Secretary Orville
G. Bentley, who pointed out that the USDA has a clear-cut
responsibility to be concerned about plant improvement. He asked
the conference participants to consider where we are today and to
communicate needs for building an incremental plan for a national
initiative on plant genome mapping and sequencing. He stated that
the system should be science-based and well planned and should
incorporate an approach to information gathering and distribution.


The Conference Chairman, Dr. Robert Faust, introduced the
participants and invited observers (listed in Appendix II) and
provided introductory remarks on scientific issues and initiatives.


The conference then turned its attention to the specifics of
existing and needed knowledge in plant genome mapping and
sequencing. The chapters of this report summarize the conference
activities as follows:


Chapter I    State-Of-The-Art in Plant Genome Mapping and
             Sequencing

Chapter II   Present Status of Efforts to Store and Retrieve
             Genome Information

Chapter III  Criteria for Selecting Plant Species

Chapter IV   Features of a National Network of Information

Chapter V    Activity to Promote Development and Implementation


The first two chapters represent summaries of information from
formal presentations made by participants in the conference. The
remaining three chapters are the result of group discussions using
the Nominal Group Technique (NGT) of Andre Delbecq. NGT is a
highly structured group-meeting procedure for identifying and
organizing items related to a particular question.

Chapter I. STATE-OF-THE-ART IN PLANT GENOME
MAPPING AND SEQUENCING


Genome mapping has taken on new and exciting dimensions with the
use of restriction fragment length polymorphism (RFLP) technology,
which allows detailed genetic mapping not possible a few years ago
(see Appendix III for definitions of Genetic Mapping). Applied to
plant species, the technology has created rapid advances in
constructing RFLP maps that point to patterns of genomic structures
and new insights into gene expression and function. RFLP
technology, coupled with DNA sequencing, is expected to converge
with new information in biochemistry and physiology and lead to new
understanding of how plants grow and develop and how they can be
genetically engineered for improved characteristics.


Convergence of these new technologies is not expected in the 
immediate or short-term future. Considerable progress has been 
made in some crop species while others lag behind. The following 
brief reports highlight the current status of research on selected
commodities and point out some of the advantages and disadvantages 
of conducting genome research on a particular species. 





Maize - RFLP mapping in maize is probably the most advanced of
all plant species. Five different research groups are developing
RFLP maps, and about 900 markers have been mapped. Research is
also being undertaken to unify the various RFLP maps, to
rationalize the conventional and RFLP maps, and to analyze
quantitative trait loci (QTL). Some work is also underway on sugar
enhancement in sweet corn.


Soybeans - Several research groups are currently engaged in
RFLP mapping in soybeans. Both random genomic and cDNA markers are
being used. RFLP polymorphism is relatively low among soybean
cultivars. Tissue culture has been explored as one method of
inducing polymorphism, but has not proven successful thus far.


Cotton - RFLP mapping in cotton has only just begun. RFLP
mapping has been used for organelle DNA. Private sector research
on mapping RFLPs in cotton has focused on using the technology
for varietal protection through DNA fingerprinting.


Vegetables - Two groups have developed RFLP maps in the tomato,
and a total of about 700 markers have been mapped. Analysis of
disease and insect resistance genes and QTL is relatively advanced
in the tomato. Other vegetable species receiving attention include
Brassica, lettuce, potatoes, and peppers, all of which have fairly
detailed RFLP maps.


Forest Species - Genome mapping in forest trees is currently
restricted to conifers. There are two Forest Service projects
aimed at constructing RFLP maps for loblolly pine: one at
Berkeley, CA, to characterize the organization of pine genomes, and
one at Gulfport, MS, to isolate fusiform rust resistance genes.
Statistical methods being developed for mapping and identifying
QTLs in humans will facilitate similar efforts in forest species.


Barley - RFLP mapping is fairly advanced in barley. Mapping
has allowed the location of some genes of agronomic importance and
the identification of genotypes carrying alternate alleles. The
North American barley breeders have coordinated a mapping effort
that includes work on developing doubled haploids, map
construction, field studies, cytogenetics, and analysis of QTL.


Forages - Although much of the genetic work on forages has been 
with grass species, forage is typically composed of both grass and 
legume species. Forage species tend to have complex genomes and 
difficult breeding systems. Breeding objectives tend to vary but 
include improved seed and herbage yield, quality and digestibility,
and pest resistance. Most work on mapping forage species has been 
with alfalfa, with a number of investigators collaborating 
nationally. Additionally, pearl millet mapping has made 
considerable advances based on research being conducted by 
scientists in India. 


Rice - Genome mapping is expected to advance rapidly in rice 
because of a new Japanese Government initiative to map and sequence 
the rice genome ($200,000,000 over a ten-year period). An RFLP map 
with about 200 markers has been developed by a group at Cornell 
University, and is being used to analyze rice disease and insect
resistance genes. 


Wheat - British scientists have a project underway to develop
an RFLP map, but the RFLP probes are proprietary. Sequence
information is available through research efforts involving genes,
probes, ribosomes, histones, chlorophyll a/b-binding proteins, and
globulins. Chromosomal locations are known for many phenotypic and
isogenic markers, and the genetics and cytogenetics of wheat are
relatively advanced.


Chapter II. CURRENT STATUS OF EFFORTS TO STORE
AND RETRIEVE GENOME INFORMATION


The conference included presentations by representatives from the
National Library of Medicine, the National Agricultural Library,
GenBank, and the National Institutes of Health Office of Human
Genome Research. These presentations addressed the current state
of cooperative efforts related to information storage and retrieval
of genome mapping and sequencing data. Highlights of these
presentations are given in this chapter.


Recent legislation (Public Law 100-607) has created a National
Center for Biotechnology Information at the National Library of
Medicine (NLM). Funding of $8 million is provided for fiscal year
1989. NLM will not necessarily be the physical center for all
biotechnology activities, but will support national needs through
a variety of activities, including sponsoring conferences and
workshops, linking information networks, and other planned
projects.

NLM is developing a Directory of Biotechnology
Information/Resources, which will be a centralized computerized
directory of international sources of publicly available
biotechnology information. NLM is also the lead agency sponsoring
a series of interagency meetings of a working group on a
Biotechnology Environmental Release Databank. The group is
investigating the need for a database containing information on
releases into the environment of biotechnology-altered organisms.


NLM has developed a prototype computer workstation for genetics
researchers. Called GenInfo, the workstation includes access to
a range of computerized sources of information, including GenBank,
EMBL, the Protein Information Resource, X-ray crystallographic
data, related bibliographic files, and an on-line copy of
McKusick’s Mendelian Inheritance in Man. Several investigators at
the National Institutes of Health are currently testing the system.


NLM has issued a request for applications for work on the
representation and analysis of molecular biology data by computer.
Over the next year, NLM will also sponsor a series of small
conferences on the E. coli genome.


NLM is working closely with the National Agricultural Library (NAL)
to ensure that between the two libraries all biotechnology
information is acquired and indexed. They are also making sure
that books and articles containing genome data are indexed in a way
that they can readily be retrieved from MEDLINE and AGRICOLA (key
NLM and NAL databases), and that information is conveyed to GenBank
in a timely and useful fashion.


NAL, which serves as both the library of the USDA and the
national library for agriculture, is involved with NLM and other
agencies in efforts to make biotechnology information readily
available to USDA staff and the entire international research
community. A Biotechnology Information Center was established
to focus on the provision of biotechnology resources,
reference services, special publications, and other needed services
on this important topic. Particular emphasis has been placed on
expanding the coverage of biotechnology information in AGRICOLA.


GenBank, the Genetic Sequence Databank, collects nucleotide
sequence data and makes those data widely available to the
international scientific community. The effort provides a computer
database of all published DNA and RNA sequences and related
bibliographic and biological information. Unpublished data are
also included. Currently, RFLP map information is not included.


GenBank is funded through a National Institute of General Medical
Sciences contract with IntelliGenetics, Inc., which contracts with
the Department of Energy to have the work done at Los Alamos
National Laboratory. Funding for GenBank comes from several
government agencies. Data collection is carried out in
collaboration with the EMBL Data Library and the DNA Data Bank of
Japan. Data are available in several machine-readable formats,
both through GenBank and through several secondary sources.
GenBank currently contains approximately 21,000 entries covering
about 25,000,000 nucleotides from all taxa of organisms.


Recent changes in the focus of GenBank include efforts to expand 
the database, decentralize the collection of data and gather data 
from a wider range of sources. Strong encouragement is given for 
the submission of data by all researchers. Changes have been made 
to make submission of data as easy as possible and ensure timely 
entry of data. Both USDA and NIH can help by communicating to 
their researchers and grant recipients the importance of submitting 
nucleotide sequence data to GenBank. The ultimate goal is to make 
GenBank data and services better and more available to the
scientific community.


The Office of Human Genome Research at NIH will provide the
leadership for the national initiative to sequence the human genome
(3 billion base-pairs). The Office can presently see the outline
of the project, but the technology is admittedly not in place.
Sequencing base-pairs is a slow and cumbersome process which could
best be done by machine if the technology can be developed. The
plan is to use RFLP maps and overlapping clones to develop a
physical map.
The program will become a very large resource of information and
materials that will be available to the whole biological research 
community. The main goal is human medicine, but the impact of the 
project goes much further. Many can already see great benefits to 
agriculture. The technology is expected to be applicable far 
beyond humans and will certainly impact all of biotechnology. 


An NIH Ad Hoc Advisory Committee has recommended the following for
the Office of Human Genome Research:


O Select model organisms with small genomes and assure that
comparisons with other organisms will be possible.

O Work on developing better technology.

O Mapping and sequencing should be done simultaneously.

O Investigator-initiated research (not research contracts)
will result in better ideas and better research.

O Sharing of data and genetic materials, and cooperation
between several agencies, organizations, and laboratories,
should be a cornerstone of this program.


The Ad Hoc Advisory Committee also recognized the importance of
resources to support the research. It is anticipated that up to
$200 million per year for a ten- to fifteen-year period should be
made available on a scaled-up funding system.


Chapter III. CRITERIA FOR SELECTING PLANT SPECIES


Using the Nominal Group Technique, the conference participants
identified fifty-six considerations related to the question: "What
are the criteria for selecting crop and forest species for genome
mapping and sequencing?" Those fifty-six considerations were
evaluated and then grouped into five related categories:


1. Economic Impact and Domestic Impact.

2. Maximum Information Transfer to Other Crop Plants.

3. Genomic Size and Analysis.

4. Basic and Fundamental Insights.

5. Existence of a Good Knowledgebase.


Of highest interest to conference participants in selecting plant
species for genome research were the likely economic benefits.
Related to this issue was the domestic importance of the plant
species selected for study.


Also related to economic impact were several items categorized as 
follows: 


O Technological - How will the technology be transferred and
what will be the short-term benefits of a plant genome mapping
and sequencing initiative? Consideration should be given to
avoiding such conflicts as ‘big science’ versus ‘small
science’, and to how the technology will impact foreign
markets, less developed countries, and world agriculture,
especially given the opportunities for international
collaboration. The selection of plant species for regional
importance may become an important point to consider.


O Financial - The conference participants raised the question 
of the availability of resources to support a major initiative
in plant genome mapping and sequencing. To obtain full value 
for the money invested, great care should be taken to avoid 
duplication of research effort and to encourage careful use 
of resources. 


O Nutritional - It is very likely that some of the practical
outcomes of a plant genome research initiative will greatly
improve the nutritional value of harvested plant species.
This point should be considered in planning the program.

O Proprietary Information - Thought should be given to
intellectual property rights, availability of knowledge, and
sharing of that knowledge within the scientific community.
Openness in exchanging information should be promoted.


O Potential for Ecological Impact - Consideration should be 
given to research that will work to improve or protect the 
environment. 


As a second item of high interest, the conference participants
identified the point that species selected for research should
provide maximum information transfer to other crop plants. This
consideration emphasizes the need to use resources wisely for
discoveries which have broad application rather than being limited
to specific taxonomic groups. The selected species should serve
as models within species groups and should collectively represent
a variety of breeding systems. Consideration should also be given
to the pool of available investigators and the genetic and
cytogenetic stocks available for research. The ease of access to
knowledge from quantitative studies as well as the potential for
developing fundamental insights are important. The research should
maximally contribute to developing basic knowledge and to providing
some scientific ‘leverage’ to extend the knowledge beyond simply
the direct applications.


The third high interest item identified by the conference
participants was the suitability of a species for genomic analysis.
Some consideration should be given to the amenability of the genome
to analysis and to the size and complexity of the genome. Care
should be taken to select species that will provide the fastest and
easiest progress in developing the technology, while at the same
time providing some taxonomic diversity in species studied. The
species selected should allow the application of existing
techniques. The degree of polymorphism should also be considered
along with the number and availability of genetic and cytogenetic
stocks, the extent of existing genetic maps, and the suitability
for QTL study. Consideration should be given to genomic stability,
generation time, ability to do transposon tagging, and the degree
of knowledge about genome organization. Finally, the extent and
status of germplasm collections of the species should be evaluated.


In addition to the above points, the criteria for selecting plant 
species for genome mapping and sequencing should include the
likelihood of developing basic knowledge and fundamental insights 
to support research in basic biology. Some points to consider in 
this aspect include: 


O Extent of existing knowledge.

O Degree of taxonomic diversity.

O Representative breeding systems.

O Extent of pre-existing genetic maps.

O Existing quantitative studies.

O Existing genetic and cytogenetic stocks.

O Scientific leverage (i.e., information transfer to
other species).

O Exploitation and complementation of existing knowledge
bases in molecular biology, biochemistry, and
physiology.

O Knowledge of genome organization.

O Interesting target genes.


The final major item identified by the conference participants was
the consideration that plant species should be selected that have
a good knowledgebase, since research can build on what is already
known and should therefore progress fastest and most easily. This
knowledgebase might be found as preexisting genetic maps, genetic
and cytogenetic stocks, and germplasm collections. Some of the
points to consider in selecting plant species based on the
existence of good knowledgebases are:


O The ability to transfer existing technology. 

O The possibility for short-term impact. 

O Existing knowledge in biochemistry and physiology. 
O Status of current, similar projects. 

O Predicted ease of success. 

O Track record of information exchange. 


One of the expected benefits to be derived from this approach is 
the contribution to the development of basic knowledge in this 
area. The research should also develop the potential for 
fundamental insights for other activities in research. 

Points that should be considered when selecting plant species for
genome mapping and sequencing include the ease of large-scale
transformation and the ability to regenerate whole plants from
single cells. The conference participants also addressed the
question of how the final selection of crops should be made: as
commodities or on identified needs for technology development.
Moreover, focusing resources on a few species risks a ‘big
science’ versus ‘small science’ confrontation, and there is a
possibility of duplicating efforts in the private sector. Hence,
some degree of coordination will be required.


Chapter IV. FEATURES OF A NATIONAL NETWORK OF INFORMATION


In the second working session, the participants of the conference
used the NGT to address the question: "What are the features of
a national information network of plant genome data?" Seventy-two
items were generated. These were then sorted and re-evaluated,
resulting in four major recommendations:


1. The network should be user-friendly with systematic
organization for input.

2. The network should allow for all types of maps,
quantitative information, and raw data.

3. The network should be kept current with frequent updates,
and include a mechanism for data validation.

4. Use of the network should be free or relatively
inexpensive.


The design of the information system is critically important to the
network’s success. It is more likely to be used if it is
user-friendly and has a logical and systematic process of data
input. Other design considerations suggested include:


O Reaching early agreement on key words. 


O Emphasizing a uniform, ordered way to input, organize, 
and relate data. 


O Providing editing capability. 


O Allowing for data quality control. 

O Including data from foreign nations. 

O Setting requirements for data validation. 

O Providing background information on crop plants (e.g., 
genome size). 

O Making available the sources of maps and sequence data. 

O Providing easy/unrestricted access. 

O Developing a mechanism to avoid duplication. 

O Appointing an advisory committee to establish rules.


O Developing a procedure to collect data in an orderly
fashion; this might include the requirements for
submission of data by all researchers.

O Providing for documentation of entries. 

O Supplying gene symbol identification. 


The conference participants made several recommendations concerning
the organization and structure of the system. The network should
have a well-defined and structured organization, yet be flexible
in order to accommodate future needs and technological advances.
The database should be relational in structure to provide access
to related information. It should handle different types of
computer formats and be capable of providing periodic evaluation
of its resources.
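
As a minimal sketch of what a relational structure could look like,
the example below uses Python's built-in sqlite3 module. The table
names, column names, and the placeholder records are all
hypothetical, chosen only for illustration; they are not a design
proposed by the conference. The sketch shows how markers from
different map types can be stored together and retrieved as one
combined map through a join.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE species (
    species_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE
);
CREATE TABLE marker (
    marker_id   INTEGER PRIMARY KEY,
    species_id  INTEGER NOT NULL REFERENCES species(species_id),
    symbol      TEXT NOT NULL,
    map_type    TEXT NOT NULL,   -- e.g. 'RFLP' or 'phenotypic'
    chromosome  TEXT NOT NULL,
    position_cm REAL,            -- genetic position in centimorgans
    source      TEXT             -- documentation of the entry
);
""")

# Placeholder records invented for this example; not real map data.
conn.execute("INSERT INTO species (species_id, name) VALUES (1, 'maize')")
conn.executemany(
    "INSERT INTO marker (species_id, symbol, map_type, chromosome,"
    " position_cm, source) VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "demo_rflp_1", "RFLP", "1", 12.5, "placeholder lab A"),
        (1, "demo_pheno_1", "phenotypic", "1", 30.0, "placeholder lab B"),
        (1, "demo_rflp_2", "RFLP", "2", 5.0, "placeholder lab A"),
    ],
)


def markers_on_chromosome(connection, species_name, chromosome):
    """Return (symbol, map_type, position) rows for one chromosome,
    ordered by position, regardless of map type."""
    cursor = connection.execute(
        """
        SELECT m.symbol, m.map_type, m.position_cm
        FROM marker AS m
        JOIN species AS s ON s.species_id = m.species_id
        WHERE s.name = ? AND m.chromosome = ?
        ORDER BY m.position_cm
        """,
        (species_name, chromosome),
    )
    return cursor.fetchall()


combined_map = markers_on_chromosome(conn, "maize", "1")
print(combined_map)
```

Keeping the map type as a column rather than as separate tables is
one possible choice; it lets phenotypic and RFLP markers be listed
side by side in a single query.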


Conference participants envisioned overall coordination of a
network of separate databases, probably at several sites, using the
same software, and with standardized features. The network should
be built on existing networks, to be compatible with them and yet
not duplicate them. It should use standard, commercially available
software. Consideration should be given to developing graphic
displays of information at the user interface.


The system should ‘gateway’ through and/or be interactive with
GenBank and other established, recognized database sources
(possibly including GRIN). The National Library of Medicine and
the National Agricultural Library should be included for related
activities.


The source code should be transparent to the user and be capable
of handling different computer formats. The system should be
available on-line (and in other formats), with good indexing and
search options and downloading capability. It should have the
ability to cross reference between maps and available probes,
analyze linkage data, and permit experimental modeling.


The system structure must be developed to support a variety of
complex mapping approaches. It should focus on providing ‘soft
copy’ (not paper) and could serve a valuable function by including
a bulletin board (with conferencing ability) on topical
information.


Inasmuch as the National Information Network for Plant Genome Data
would predominantly be a service to the research community, the
mixture of types of information to be stored for retrieval is
complex. The demands on the system will be extensive and dynamic.
Inclusion of different types of maps, some requiring
co-presentation for analysis, will require sophisticated approaches.
Inclusion of raw data is seen as critical for some types of
analytical processes and should be given strong consideration. The
availability of both maps and sequence data in one system is a very
important feature. References to related, published information
must also be available on the system.


Information will need to be stored describing the identified
function of specific sequences, cross-identified with identical
genes in different plant species. The combination of different
types of maps (e.g., phenotypic and RFLP) in one system will be
especially useful to researchers.


There will be a need to allow the analysis of data (e.g., linkage
information) and permit experimental modeling with archived and
newly entered data. There should be a roster of researchers
identified by different types of categories, and catalogs should
be provided of germplasm available to researchers.
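
To illustrate the kind of linkage analysis such a system might
support, the sketch below estimates the recombination fraction
between two markers from progeny counts and converts it to a map
distance using the standard Haldane and Kosambi map functions. The
report does not specify which analyses or map functions the network
should offer; the function names and the backcross counts here are
hypothetical and invented for illustration.

```python
import math


def recombination_fraction(recombinant, parental):
    """Fraction of scored progeny that are recombinant between two markers."""
    total = recombinant + parental
    if total == 0:
        raise ValueError("no progeny scored")
    return recombinant / total


def haldane_cm(r):
    """Haldane map function (assumes no crossover interference).
    Returns map distance in centimorgans."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r):
    """Kosambi map function (allows for some crossover interference).
    Returns map distance in centimorgans."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


# Hypothetical backcross counts, invented for this example.
r = recombination_fraction(recombinant=20, parental=180)
print(f"r = {r:.3f}")
print(f"Haldane: {haldane_cm(r):.2f} cM, Kosambi: {kosambi_cm(r):.2f} cM")
```

For small recombination fractions the two functions give similar
distances; they diverge as markers lie farther apart, which is one
reason the underlying raw counts are worth storing alongside the
finished maps.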


Information should be provided on synteny (i.e., common order of
information along chromosomes in different species), and provisions
should be made to allow (indeed encourage) submissions of partial
DNA sequences to accompany RFLP data. The system should also
accommodate organelle DNA information including cytoplasmic and
mitochondrial functions. There should be descriptions of probes
that include the enzyme and fragment size.
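
One simple way to look for conserved marker order between two
species can be sketched as follows. The approach shown, a longest
common subsequence over two ordered marker lists, is an assumption
made for illustration and is not a method named in the report; the
function name conserved_order and the marker names g1 to g5 are
hypothetical placeholders.

```python
def conserved_order(order_a, order_b):
    """Return the longest run of markers that appear in the same
    relative order in both lists (longest common subsequence)."""
    rows, cols = len(order_a), len(order_b)
    # table[i][j] holds the LCS length of order_a[i:] and order_b[j:].
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows - 1, -1, -1):
        for j in range(cols - 1, -1, -1):
            if order_a[i] == order_b[j]:
                table[i][j] = 1 + table[i + 1][j + 1]
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk the table from the start to recover one such subsequence.
    shared = []
    i = j = 0
    while i < rows and j < cols:
        if order_a[i] == order_b[j]:
            shared.append(order_a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return shared


# Hypothetical marker orders along one chromosome in two species.
species_x = ["g1", "g2", "g3", "g4", "g5"]
species_y = ["g3", "g1", "g2", "g4", "g5"]

print(conserved_order(species_x, species_y))
```

In this invented example four of the five markers keep their
relative order, while g3 sits in a different position in the second
species.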


Currency and validation are two important characteristics of useful 
information systems. Some ways of providing these two important 
dimensions include the appointment of a Research Advisory Committee 
to continually oversee the information network and to set rules for 
participation in the system. Quality control of data input will 
be a critical, ongoing concern. There must be a mechanism 
established for data validation, including the documentation of 
entries. 


One of the ways of maintaining the currency of information is to 
provide listings of related publications and, in the opposite 
direction, trigger refereed journals to provide information to the 
network. Consideration should be given to ‘coercing’ principal 
investigators to submit data to the system in the same way as is 
currently done for Chemical Abstracts and other services. 


Mechanisms will have to be developed to avoid redundancy of
information within the system and duplication with other networks.
In addition to user-friendliness, inexpensive or free access to
the system should contribute to its success.


Some other topics not directly related to the above items were also 
identified: 


O An extended aspect of the information network might be 
the establishment of physical resources to store genetic 
clones as they relate to the information network. 


O The system should be designed to attract multiple funding 
to support the service. 

O Some mechanism should be developed to handle proprietary 
information. 

O The information in the system should somehow identify
data as public or proprietary, including patent
information on not only the sequence, but also the
intended commercial uses of the discovery.


O The system should be secured to prevent unauthorized
modification of data.


Chapter V. ACTIVITIES TO PROMOTE DEVELOPMENT AND IMPLEMENTATION


In its third working session, the conference participants used the
NGT to address the question: "What activities should the USDA
undertake to promote the development of a national initiative on
plant genome research?" The participants generated seventy-five
items addressing this question. These were then grouped and
associated into six categories:

1. The USDA should develop a clear working definition of the
problem and establish the priorities.

2. The advantages of working with plant genomes and the
added value of research on plants should be clearly
spelled out.

3. The USDA should accept and fund group proposals for
mapping model species.

4. Some core facilities should be established for plant
genome studies.

5. The USDA should appoint a comprehensive Research Advisory
Committee on plant genome research.

6. Funding to support activities on plant genome research
should be made available.


The first topic identified the need to more clearly define what is
meant by plant genome research, including mapping and sequencing.
The objectives of such research activities should be spelled out
and phrased in a form understandable beyond the scientific
community. An evaluation of the true magnitude of the effort
should be conducted and the different elements of the task should
be identified. The priorities for genome mapping and sequencing
should be developed and the total information ‘package’ should be
specified.


Consideration should be given to appointing an ad hoc committee to
design an organizational structure and, perhaps, even appoint a
Director of an Office of Plant Genome Research (OPGR). The
establishment of an OPGR will be important because activities will
need to be coordinated with other projects, such as the human
genome effort, and because choices will need to be made at an
early date, such as which specific technology should be developed
or which individual crops should be selected. There will be a need
to target certain information technology and transfer methods.
Thus, a Research Advisory Committee could be very helpful in
setting out these priorities.


Finally, the early identification of someone to champion the cause 
could be very instrumental in getting the initiative under way. 


The second item identified by the group was the need to develop a
clear statement on the advantages of conducting research on plant
genomes. The biological advantages of conducting research with
plants are their exceptional characteristics of photosynthesis,
photoperiodism and special synthetic and degradative pathways.
These include the synthesis of essential amino acids and the
glyoxylate shunt for the conversion of fatty acids to sucrose,
systems which are restricted to plants.


Other biological advantages of conducting research with plants
include unique hormones, the presence of apical and lateral
meristems, the transduction of environmental signals into
physiological and morphological responses and the synthesis of
complex secondary products, often having great commercial value.
Finally, plants offer the important research advantage of
permitting the regeneration of whole organisms from individual
cells.


Some of the genetic advantages of conducting research with plants
include their reproductive systems, which allow the establishment
of unique individual lines and populations, possibilities for large
population sizes, ease of introducing heritable traits, and
analytical procedures for complex genetic inheritance.


There would seem to be a need to develop some examples of the value
and uses of conducting research with plants to better promote the
benefits to individuals beyond the plant science community. Some
of this information might be obtained informally from the
scientific community, or as part of an organized international
meeting on plant genome research. Another approach might be to
publish a book on the state-of-the-art of plant genome research and
the expected applications and projected future directions. Another
suggestion would be to distribute this conference report and other
information broadly to congressmen, lobbyists, client groups, and
others.


Another approach might be to address the consequences of not
undertaking the research. These include a diminished competitive
position for U.S. agricultural products in world markets and a
lessened U.S. competitive position in agricultural science. These
‘costs of lost opportunities’ could be evaluated as projected
impacts.


The third topic identified by the group dealt with organizational
aspects of the research. It was proposed that the USDA consider
soliciting proposals from groups of scientists for mapping model
species. The mechanism for undertaking this activity has not been
specified, but it might take the form of new areas in competitive
grants, regional research, or consortia for special funding.


Some other activities that would work towards early initiation of 
a national initiative in plant genome research include: 


O Promoting training in quantitative genetics and breeding,
both at the graduate level and for more advanced
scientists. [Note: Plant breeders should be retrained
in the new methods and scientists working with the new
methods could benefit from more experiences in
quantitative genetics and breeding.]

O A demonstration project should be established to draw
attention to the needs and ‘show the way’ in plant genome
research.

O As a first step, one species should be selected to lay
out the costs and a timetable for sequencing an entire
plant genome. This might then serve as a basis for
evaluating the true dimensions of a full national
initiative on many crops.

O Research should be undertaken to demonstrate the
relationships between genetic information and plant
physiology and biochemistry.

O An interim project of high priority for plant genome
mapping should be established some time soon within the
competitive grants program.

O A focus should be placed upon developing suitable
information technology and information transfer
techniques.

O Attention should be given to mapping whole genomes and
not just individual genes.


The need for funds for curating biological collections and for
funds to support postdoctoral associates was noted. The need to
de-emphasize routine laboratory work and stress the development of
new technologies was also seen as important. Finally, there was
a recognized need to document present ongoing activities and to
judge the capacity of current personnel to undertake a major
national initiative in plant genome research.


The fourth item identified during this working group session was
the need to establish core facilities for plant genome studies.
Some thought should be given to the development of data collection
centers that will be either 1) based at a central point, or 2)
somehow distributed by crops. Decisions will be needed on how to
establish priorities for such core facilities and to identify the
resources and organization that would support the effort. There
may be a need for a central coordinating office for facilities and
perhaps even the appointment of a core facilities director. This
would then work toward better coordination of facilities to support
a national initiative.


The conference participants recognized the need for USDA to accept
the responsibility for being the lead agency for plant genome
database networking. Protocols and procedures should be
established to distribute probes within the scientific
community and to provide a repository for master genetic stocks.
Computer specialists should be identified to work with plant
biologists in a planned and coordinated way to develop needed
computer hardware and software.


Finally, an immense amount of effort will be required to make
certain that the efforts of other agencies and other non-plant
efforts (especially those of the NIH Office of Human Genome
Research), and other public, private and international projects,
are coordinated with, and complementary to, the USDA initiative.


The next item identified by the conference participants was the
need to appoint a Research Advisory Committee for a national
initiative on plant genome research. The advisory committee should
help establish priorities and enlist the endorsement of high
visibility scientists, other agencies and institutions. It should
help identify the resources needed to conduct the research. The
advisory committee should also help with central coordination,
interagency coordination, and coordination with others working in
plant genome research. The advisory committee should be looked to
for advice on how funds should be distributed within the scientific
community. It should suggest mechanisms for program implementation
and evaluate the success of the program. The advisory committee
should also assist in obtaining support from agricultural industry,
the university community and scientific societies. It should work
to promote the contributions of plant genome research for enhancing
national pride and prestige, both within the scientific community
and with the public-at-large. It should assist in addressing the
need to document ongoing activities and evaluate the capacity of
current personnel to undertake aspects of the initiative. Finally,
the committee should help coordinate activities for the scientific
community, and establish the various rules, requirements and/or
guidelines for data dissemination for information networks such as
GenBank.


The dimensions of a national initiative on plant genome research
may ultimately be restricted by financial constraints. Broad-based
support for the initiative will be critical for not only obtaining
Federal funding, but also for securing matching funds from the
private sector. The extent of funding will determine the degree
of success as well as the amount of international participation.
These aspects are considered critical for a successful national
initiative in plant genome mapping.


It was proposed that the USDA consider having an NIH-style training 
grant program in plant genome research. 


A realistic budget estimate should be developed for a major
initiative in plant genome research. Consideration should be given 
to the full dimensions of a research program in the long-term, and 
the desirability of providing continued funding for projects for 
extended periods. Thought should also be given to coordinating an 
initiative with The National Agricultural Research Institute 
proposal currently under consideration in the Congress - perhaps 
making this project part of that effort. 


The economic benefits of the proposed research need to be developed 
more clearly and effectively in order to win broad support for an 
initiative. 


To communicate the program needs, the conference participants
recommended that the USDA sponsor a symposium on mapping and
sequencing plant genomes. This activity, or other such activities,
should publicize broadly the accomplishments, plans and expected
benefits of an initiative.


Finally, it was a strongly held view of the conference participants 
that the USDA’s Competitive Grants Program should not simply be 
redirected into plant genome research, but should be supplemented 
with new dollars (or funds from other sources) to complement the
existing competitive research grant activities. 


The conference participants gave some attention to specific
activities beyond those of science that would help promote the
initiative in plant genome research. It was suggested that an
effort should be made to ‘sell’ the idea to Congress inasmuch as
significant funding will be required to sustain the activity. Some
thought should be given to developing a legislative mandate, much
as the human genome research effort has done with biotechnology.
There will be a need to communicate the program’s benefits to
commodity groups, crop improvement associations, other client
groups and the public-at-large.


It was proposed that the USDA evaluate the possibility of
soon establishing a prototype grants and training program
to support projects in plant genome research.


As a final thought, the point was made for a central focus of ideas
and authority for the initiative. This authority might be achieved
through the appointment of a Research Advisory Committee, the
establishment of an Office of Plant Genome Research, or some other
mechanism to provide the needed leadership and energy to move an
initiative forward.


APPENDIX I


CROP AND FOREST GENOME MAPPING CONFERENCE 


Cooperative State Research Service, USDA 
Room 338 Aerospace Building, 901 D Street, S.W. 
Washington, DC 


December 12-14, 1988 


MONDAY, DECEMBER 12 
A.M. 


8:30 WELCOME 


Dr. Orville Bentley 
Assistant Secretary for Science and Education 


U. S. Department of Agriculture 
Washington, DC 


Mr. Stan Cath

Executive Director 
Agricultural Research Institute 
Bethesda, MD 


COMMENTS BY CHAIRPERSON 
Dr. Robert Faust 
Agricultural Research Service 
U.S. Department of Agriculture 
Beltsville, MD 
9:00 WHERE ARE WE TODAY ON PLANT GENE SEQUENCING? 
OVERVIEW: 


9:30 - 9:45 CORN - Dr. D. Hoisington 
University of Missouri


9:45 - 10:00 SOYBEANS - Dr. B.F. Matthews 
Agricultural Research Service (ARS), USDA 


10:00 - 10:15 COTTON - Dr. R. J. Kohel 
Texas A&M University 


10:15 - 10:30 VEGETABLES - Dr. James Nienhuis
Native Plants Incorporated


10:30 - 10:45 FOREST - Dr. D. Neale
Pacific Southwest Forest and Range Experiment
Station


10:45 - 11:00 BARLEY - Dr. T. K. Blake
Montana State University


11:00 - 11:15 FORAGES - Dr. David Sleper
University of Missouri


11:15 - 11:30 RICE - Dr. R. L. Rodriguez
University of California, Davis


11:30 - 11:45 WHEAT - Dr. Frank Greene
ARS, c/o National Science Foundation


LUNCH - "ON YOUR OWN"


MONDAY, DECEMBER 12
P.M.


1:00 NOMINAL GROUP TECHNIQUE WORKING SESSIONS
Dr. David R. MacKenzie
Cooperative State Research Service (CSRS), USDA


WORKSHOP #1: WHAT ARE THE CRITERIA UPON WHICH THE USDA
SHOULD CONSIDER WHEN SELECTING CROP AND FOREST SPECIES
FOR GENOME MAPPING AND SEQUENCING?


Co-moderators:


Dr. David Sleper, University of Missouri
Dr. David MacKenzie, CSRS, USDA


SESSION SUMMARY


RECEPTION AND MIXER - LEWIS ROOM, HOLIDAY INN CAPITOL


DINNER - "ON YOUR OWN"


TUESDAY, DECEMBER 13
A.M.


8:30 - 12:00 CURRENT STATUS OF COOPERATIVE EFFORTS TO COLLECT,
ARCHIVE AND DISSEMINATE GENOME DATA


Moderator: 


Mr. Keith Russell 
National Agricultural Library 


Discussant: Dr. James J. Ferguson 
Specialized Information Services 
National Library of Medicine 


Discussant: Mr. Jamie Hayden
GenBank
Los Alamos National Laboratory


Discussant: Dr. Elke Jordan
Office of Human Genome Research
National Institutes of Health


LUNCH - "ON YOUR OWN" 


TUESDAY, DECEMBER 13
P.M.


1:30 - 4:30 WORKSHOP #2: WHAT ARE THE FEATURES OF A NATIONAL
NETWORK OF GENE MAPS?


Co-moderators:


Mr. Keith Russell, National Agricultural Library
Dr. David MacKenzie, CSRS, USDA


4:30 - 5:00 Summary of Session


WEDNESDAY, DECEMBER 14 
A.M. 


8:30 - 11:30 WORKSHOP #3: WHAT ACTIVITIES SHOULD THE USDA UNDERTAKE 
TO PROMOTE THE DEVELOPMENT AND IMPLEMENTATION OF A 
NATIONAL INITIATIVE IN PLANT GENOME RESEARCH? 


Co-moderators:


Dr. J. Miksche, ARS, USDA
Dr. David MacKenzie, CSRS, USDA


11:30 - 12:00 Summary of Session 


WEDNESDAY, DECEMBER 14 
P.M. 


Session moderators will prepare final draft of
session summaries for inclusion in report to Science
and Education, USDA.


Dr. A.G. Abbott, Asst. Professor
Dept. of Biological Sciences 
132 Long Hall 

Clemson University 

Clemson, South Carolina 29634 


Dr. Frank Greene 

Developmental Biology, Division of 
Cellular Biosciences 

National Science Foundation 

1800 G Street, N.W., Room 321 

Washington, D.C. 20550 


Dr. Raymond L. Rodriguez 
Department of Genetics 
University of California 
Davis, California 95616 


Dr. James J. Ferguson 
Specialized Information Services 
Bldg 38-A, Room 35-317 
National Institutes of Health 
National Library of Medicine 
8600 Rockville Pike

Bethesda, Maryland 20894 


Dr. Elke Jordan, Director 

Office of Human Genome Research 
Building 1, Room 332 

National Institutes of Health 
Bethesda, Maryland 20892 


Dr. J.P. Van Buijtenen 

Forest Genetics Laboratory 
Texas A&M University 

College Station, TX 77843-2131 


Dr. Gerald Selzer 

National Science Foundation 
1800 G Street, N.W., Rm 312 
Washington, D.C. 20550 


Dr. Scott Tingey
E.I. du Pont de Nemours & Co., Inc.
Agricultural Products Dept.
Experimental Station, P.O. Box 80402
Wilmington, Delaware 19880-0402


Dr. S.D. Tanksley
Dept. of Plant Breeding and Biometry
Cornell University
Ithaca, New York 14853


Dr. R.L. Phillips 

Dept. of Agronomy and Plant Genetics 
University of Minnesota 

St. Paul, Minnesota 55108 


Dr. Robert Fincher 

Department of Biotechnology Research 
Pioneer Hi-Bred Internat'l, Inc.

7250 NW 62nd Avenue 

Johnston, Iowa 50131-0038


Dr. L.C. Hannah 

Interdisciplinary Center for Biotechnology 
Research 

Agricultural, Medical and Life Sciences 

1301 Fifield Hall 

University of Florida 

Gainesville, Florida 32611 


Dr. E.H. Coe, Jr.

ARS-USDA, Dept. of Agronomy 
Curtis Hall 

Univ. of Missouri 

Columbia, MO 65211 


Dr. Paul H. Sisco 

ARS/USDA, Dept. of Crop Science 
Box 7620 

North Carolina State Univ. 

Raleigh, NC 27695-7620 


Dr. David A. Sleper 
Dept. of Agronomy 
209 Waters Hall 
University of Missouri 
Columbia, MO 65211 


Dr. Jamie Hayden 

GenBank 

T-10, MS K710

Los Alamos National Laboratory 
Los Alamos, NM 87545 


Dr. David A. Hoisington 
Dept. of Agronomy 
University of Missouri 
Columbia, MO 65211 


Dr. J.E. Specht 

Dept. of Agronomy 
University of Nebraska 
Lincoln, NE 68583-0915 


Dr. R.L. Lower 

College of Agr. & Life Sciences 
University of Wisconsin 
Madison, Wisconsin 53706 


Dr. T.K. Blake 

Plant and Soil Science Dept. 
Montana State University 
Bozeman, Montana 59717 


Dr. David R. MacKenzie 

Nat. Biol. Impact Assessment Program
Room 330L Aerospace Bldg.

901 D Street, S.W. 

Washington, D.C. 20251-2200 


Dr. Peter Greening 

International Fund for Agricultural 
Research 

c/o Winrock International

Rosslyn Plaza 

1611 North Kent Street 

Arlington, VA 22209 


Dr. Robert M. Faust 
USDA/ARS 

National Program Staff 
Bldg. 005, Room 236 
BARC WEST 

Beltsville, MD 20705 


Keith W. Russell 

National Agricultural Library 
US Dept of Agr. 

Beltsville, MD 20705 


Jean Larson 

Biotechnology Information Center 
Room 301, Nat'l Agr. Library 
Beltsville, MD 20705 


Suzanne Nanis 

Biotechnology Information Center 
Nat’l Agric. Library, Room 301 
Beltsville, MD 20705 


Diane Behrens 
ARS/USDA 


Dr. Stanley L. Krugman 

U.S. Forest Service/USDA 
Rosslyn Plaza Bldg. E., Rm 1205 
Rosslyn, Virginia 22209 


Dr. Jerome P. Miksche 
ARS/USDA 

Room 233, B-005 BARC-West 
Beltsville, MD 20705 


Dr. Richard Parry 

ARS/US Dept. of Agriculture 
Room 401, B-005, BARC West 
Beltsville, MD 20705 


Dr. James Nienhuis
Native Plants Incorporated
Salt Lake City, UT 84108


Federal Grain Insp. Service


APPENDIX III
DEFINITIONS


Recombinational Maps.

O A map defining linkage relationships and meiotic
recombination distances (in map units or centimorgans)
between markers.

A. Conventional Maps.
Maps based on morphological, physiological or
biochemical markers.

B. RFLP Maps.
Maps based on the use of cloned DNA fragments as
markers.


Physical Maps.

O A map in which the distance between markers is
expressed/defined in physical units (kilobase pairs,
relative distance along meiotic/mitotic chromosomes).

A. Cytogenetic Maps.
Maps where the relative position of markers is
expressed as a distance along either meiotic
(pachytene) or mitotic chromosomes.

B. Restriction Maps.
Maps where the distance between markers is
expressed in terms of base pairs. Distances
determined by one or more of the following
methods: pulsed field gel electrophoresis,
restriction mapping, physical linkage.

C. Contig Maps.
Maps which describe a series of overlapping
genomic clones (cosmid, YAC, etc.)


DNA Sequence

O The primary nucleotide sequence of DNA.
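

As an illustration of the map units used for recombinational maps,
the sketch below converts an observed recombination fraction into a
distance in centimorgans. It assumes Haldane's mapping function (no
crossover interference), which is one common choice and is not
prescribed by these definitions; the function names are
hypothetical.

```python
import math


def haldane_cm(r: float) -> float:
    """Map distance in centimorgans from recombination fraction r.

    Assumes Haldane's mapping function: d = -50 * ln(1 - 2r).
    """
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must satisfy 0 <= r < 0.5")
    return -50.0 * math.log(1.0 - 2.0 * r)


def haldane_r(d_cm: float) -> float:
    """Inverse conversion: recombination fraction from centimorgans."""
    if d_cm < 0.0:
        raise ValueError("map distance must be non-negative")
    return 0.5 * (1.0 - math.exp(-d_cm / 50.0))


if __name__ == "__main__":
    for r in (0.01, 0.10, 0.30):
        print(f"r = {r:.2f} -> {haldane_cm(r):.2f} cM")
```

For closely linked markers the result is nearly 100 times the
recombination fraction; the correction grows as the fraction
approaches one half.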